c++ jsoncpp中文和\uXXXX使用toStyledString生成字符串中文乱码解决方案 | 您所在的位置:网站首页 › std string 中文编码 › c++ jsoncpp中文和\uXXXX使用toStyledString生成字符串中文乱码解决方案 |
目录 一、中文乱码解决方法 1.1、乱码展示 1.2、乱码原因及解决方法 二、含有\uXXXX解析乱码的解决方法 2.1、乱码展示 2.2、乱码原因 2.3、解决方法 一、中文乱码解决方法 1.1、乱码展示在使用jsoncpp解析含有中文的字符串的时候,使用toStyledString()函数生成的字符串中的中文部分将变成\u加4个16进制数字会出现解析乱码的情况。 比如: jsoncpp的源码来分析(官方下载地址:http://sourceforge.net/projects/jsoncpp/files/ )。通过分析StyledWriter的writeValue函数发现他对字符串的处理通过valueToQuotedStringN函数进行了转义: static String valueToQuotedStringN(const char* value, unsigned length) { if (value == nullptr) return ""; if (!isAnyCharRequiredQuoting(value, length)) return String("\"") + value + "\""; // We have to walk value and escape any special characters. // Appending to String is not efficient, but this should be rare. // (Note: forward slashes are *not* rare, but I am not escaping them.) String::size_type maxsize = length * 2 + 3; // allescaped+quotes+NULL String result; result.reserve(maxsize); // to avoid lots of mallocs result += "\""; char const* end = value + length; for (const char* c = value; c != end; ++c) { switch (*c) { case '\"': result += "\\\""; break; case '\\': result += "\\\\"; break; case '\b': result += "\\b"; break; case '\f': result += "\\f"; break; case '\n': result += "\\n"; break; case '\r': result += "\\r"; break; case '\t': result += "\\t"; break; // case '/': // Even though \/ is considered a legal escape in JSON, a bare // slash is also legal, so I see no reason to escape it. // (I hope I am not misunderstanding something.) // blep notes: actually escaping \/ may be useful in javascript to avoid = 0x20) result += static_cast(cp); else if (cp < 0x10000) { // codepoint is in Basic Multilingual Plane result += "\\u"; result += toHex16Bit(cp); } else { // codepoint is not in Basic Multilingual Plane // convert to surrogate pair first cp -= 0x10000; result += "\\u"; result += toHex16Bit((cp >> 10) + 0xD800); result += "\\u"; result += toHex16Bit((cp & 0x3FF) + 0xDC00); } //result += *c; }break;改为: default: { result += *c; }break;最终结果为: 参考链接: c++ jsoncpp使用toStyledString生成字符串中文乱码解决方案 二、含有\uXXXX解析乱码的解决方法 2.1、乱码展示json文件如下: 解析结果: 之前改过valueToQuotedStringN函数,这个函数是将字符串转化为unicode编码,所以直接读取\uXXXX格式的字符串得到的其实是utf-8的字符串(如果读的是中文才是unicode编码)。所以这里需要额外的将字符串转化为unicode代码 2.3、解决方法utf-8转unicode: wstring UTF8ToUnicode(const string& str) { int len = 0; len = str.length(); int unicodeLen = ::MultiByteToWideChar(CP_UTF8, 0, str.c_str(), -1, NULL, 0); wchar_t * pUnicode; pUnicode = new wchar_t[unicodeLen + 1]; memset(pUnicode, 0, (unicodeLen + 1) * sizeof(wchar_t)); ::MultiByteToWideChar(CP_UTF8, 0, str.c_str(), -1, (LPWSTR)pUnicode, unicodeLen); wstring rt; rt = (wchar_t*)pUnicode; delete pUnicode; return rt; }在程序中加入该函数,并调用: std::string ws2s(const std::wstring& ws) { std::string curLocale = setlocale(LC_ALL, NULL); setlocale(LC_ALL, "chs"); const wchar_t* _Source = ws.c_str(); size_t _Dsize = 2 * ws.size() + 1; char *_Dest = new char[_Dsize]; memset(_Dest, 0, _Dsize); wcstombs(_Dest, _Source, _Dsize); std::string result = _Dest; delete[]_Dest; setlocale(LC_ALL, curLocale.c_str()); return result; } //调用 std::string content = root["Cnki"][i]["content"].toStyledString(); wstring wstr = UTF8ToUnicode(content);//将utf-8转化为unicode格式 cout |
CopyRight 2018-2019 实验室设备网 版权所有 |